Visual Field Pointwise Analysis of the Idiopathic Intracranial Hypertension Weight Trial (IIH:WT)

Purpose This study was designed to determine if point analysis of the Humphrey visual field (HVF) is an effective outcome measure for people with idiopathic intracranial hypertension (IIH) compared with mean deviation (MD). Methods Using the IIH Weight Trial data, we performed a pointwise analysis of the numerical retinal sensitivity. We then defined a medically treated cohort as having MDs between −2 dB and −7 dB and calculated the number of points that would have the ability to change by 7 dB. Results The HVF 24-2 mean ± SD MD in the worse eye was −3.5 ± 1.1 dB (range, −2.0 to −6.4 dB). Total deviation demonstrated a preference for the peripheral and blind spot locations to be affected. Points between 0 dB and −10 dB demonstrated negligible ability to improve, compared with those between −10 dB and −25 dB. For the evaluation of the feasibility for a potential medical intervention trial, only 346 points were available for analysis between −10 dB and −25 dB bilaterally, compared with 4123 points in baseline sensitivities of 0 to −10 dB. Conclusions Patients with IIH have mildly affected baseline sensitivities in the visual field based on HVF analyzer findings, and the majority of points do not show substantial change over 24 months in the setting of a randomized clinical trial. Most patients with IIH who are eligible for a medical treatment trial generally have the mildest affected baseline sensitivities. In such patients, pointwise analysis offers no advantage over MD in detection of visual field change.


Introduction
Idiopathic intracranial hypertension (IIH) is characterized by raised intracranial pressure (ICP) associated, in most cases, with papilledema, visual field defects, and, in some cases, permanent visual loss. 1 Most people with IIH have moderate to severe headaches, systemic metabolic dysfunction, and central obesity. 2-4 The incidence of IIH is increasing around the world, commensurate with the increase in worldwide obesity. 5,6 Depending on the severity of IIH, patients can be treated with weight loss alone, ICP-lowering medications such as acetazolamide, ICP-lowering surgery, or a combination of these. The IIH Weight Trial (IIH:WT) showed that weight loss achieved by bariatric surgery resulted in long-term remission of ICP compared with a lifestyle weight-management intervention. 7 The Humphrey visual field (HVF) mean deviation (MD) has been used as an endpoint in IIH clinical trials. 8,9 However, although the IIH:WT met its primary endpoint (change in ICP measured by lumbar puncture opening pressure at 12 months), there was no significant improvement seen in the MD in either arm. We thus wondered if a different method, pointwise analysis, might be a more sensitive indicator of a change in the visual field in IIH patients participating in a treatment trial.
There are a number of different ways to evaluate visual field damage. 10-13 The MD determined by the HVF Analyzer (Carl Zeiss Meditec, Dublin, CA) is measured in decibels (dB) using a logarithmic scale and determines the average difference in visual field sensitivity compared with the mean sensitivity of a normal person of the same age. Weighting is inversely proportional to the expected variance at each location in a normal population, effectively giving more weight to the central locations. 14-16 A key regulator, the US Food and Drug Administration, considers a change of 7 dB in MD to be acceptable as being clinically meaningful. 17 In IIH, the expected MD change is smaller compared with other optic neuropathies such as glaucoma. For most patients in an IIH medical intervention trial, a 7-dB change would be unachievable, as the MD inclusion criteria would likely be between −2 dB and −7 dB, which would represent a floor effect.
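The variance weighting described above can be illustrated with a minimal sketch. It assumes the MD is a weighted mean of pointwise deviations from age-matched normal values, with weights equal to the inverse of the normal variance at each location. The function names, the three-location example, and all numbers are hypothetical; the instrument's normative database is not reproduced here.

```python
def mean_deviation(measured_db, normal_db, normal_variance):
    """Variance-weighted mean of pointwise deviations from normal (dB).

    Assumption: weight = 1 / (normal variance at that location), so
    low-variance (typically central) locations count more.
    """
    if not (len(measured_db) == len(normal_db) == len(normal_variance)):
        raise ValueError("inputs need one entry per test location")
    weights = [1.0 / v for v in normal_variance]
    deviations = [m - n for m, n in zip(measured_db, normal_db)]
    return sum(w * d for w, d in zip(weights, deviations)) / sum(weights)


def room_to_improve(md_db):
    """Largest MD gain possible if the field returned to age-matched normal (0 dB)."""
    return max(0.0, -md_db)


# Hypothetical three-location field: central, mid-peripheral, peripheral.
measured = [28.0, 25.0, 20.0]
normal = [30.0, 29.0, 28.0]
variance = [1.0, 2.0, 4.0]

md = mean_deviation(measured, normal, variance)
unweighted = sum(m - n for m, n in zip(measured, normal)) / len(measured)
```

In this example the peripheral loss is down-weighted, so the weighted MD is milder than the unweighted average. A participant entering with an MD of −3.5 dB has at most 3.5 dB of room to improve under this assumption, which is why a 7-dB change in MD cannot be reached.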
Another functional endpoint that has been recommended for an optic neuropathy treatment trial is a change of 7 dB in five or more predefined reproducible visual locations. 17 Restricting an analysis to a particular subset of points in the visual field has not been previously prospectively investigated in IIH; however, the IIH Treatment Trial (IIHTT) investigators performed a post hoc pointwise analysis of the HVF. For each of the 52 points, a linear regression analysis was performed with the decibel measurement as the outcome variable and time as the independent variable. The IIHTT investigators demonstrated that peripheral points were more affected than central points. Although the magnitude of change in points was modest, there was significantly more improvement in the acetazolamide treatment arm. 10 Given the lack of correlation in the IIH:WT outcome measures and MD, we hypothesized that a pointwise analysis of the IIH:WT visual field data could potentially reveal localized improvements not demonstrated by the MD. The number of participants required and the number of points that could be predicted to change in an IIH trial population could be determined. The purpose of this study was to assess if point analysis of the HVF would be feasible in a cohort of people with active IIH in the setting of a randomized clinical trial.
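The pointwise regression described for the IIHTT post hoc analysis can be sketched as follows. This is an illustrative ordinary least-squares fit per location, not the investigators' code; the location labels, visit times, and sensitivities are hypothetical.

```python
def slope_db_per_year(times_years, sensitivities_db):
    """Ordinary least-squares slope of sensitivity (dB) against time (years)."""
    n = len(times_years)
    if n < 2 or n != len(sensitivities_db):
        raise ValueError("need at least two matched visits")
    mean_t = sum(times_years) / n
    mean_y = sum(sensitivities_db) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("visit times must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_years, sensitivities_db))
    return sxy / sxx


def pointwise_slopes(times_years, series_by_location):
    """Fit one regression per test location; returns {location: slope}."""
    return {loc: slope_db_per_year(times_years, ys)
            for loc, ys in series_by_location.items()}


# Hypothetical data: visits at baseline, 12 months, and 24 months.
visits = [0.0, 1.0, 2.0]
example = {
    "loc_01": [20.0, 22.0, 24.0],  # improving location
    "loc_02": [30.0, 30.0, 30.0],  # stable location
}
slopes = pointwise_slopes(visits, example)
```

A positive slope indicates improving sensitivity at that location; in a full 24-2 field the same fit would be repeated for each of the 52 points.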

Materials and Methods
IIH:WT was a prospective, multi-center, open-label, parallel-group, controlled trial in which participants with IIH were randomized in a 1:1 ratio to a bariatric surgery pathway or the Weight Watchers program, a community weight management intervention (CWI). IIH:WT (14/WM/0011) was approved by the Ethics Review Board of the National Research Ethics Committee West Midlands - The Black Country. In accordance with the Declaration of Helsinki, all subjects gave written informed consent to participate in the study, and the detailed clinical trial methodology has been published. 18 Anonymized individual participant data will be made available along with the trial protocol and statistical analysis plan. Proposals should be made to the corresponding author and will be reviewed by the Birmingham Clinical Trials Unit Data Sharing Committee in discussion with the chief investigator. A formal data sharing agreement may be required after release of the data has been approved and before the data can be released.

Subjects
Women (18-55 years old) with a body mass index (BMI) > 35 kg/m² were eligible if they had a clinical diagnosis of active IIH according to criteria outlined by Friedman et al. 19 All participants were recruited between March 2014 and October 2017. Evaluations were performed at baseline, 12 months, and 24 months. 18 The primary outcome was ICP as measured by lumbar puncture; secondary outcomes have been reported elsewhere. 7,18 At each visit, HVF with a 24-2 Swedish interactive threshold algorithm standard test pattern using a size III white stimulus was performed. HVFs were included for analysis if they were considered reliable, defined according to previous criteria as a false-positive rate of less than 15% and fixation losses and a false-negative rate of less than 30%. 20
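The reliability criteria can be expressed as a simple filter. The sketch below assumes strict "less than" cut-offs of 15% for false positives and 30% for both fixation losses and false negatives, which is one reading of the criteria above; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityIndices:
    """Reliability indices from one HVF printout, in percent (hypothetical container)."""
    false_positive_pct: float
    false_negative_pct: float
    fixation_loss_pct: float


def is_reliable(idx, fp_limit=15.0, fn_limit=30.0, fl_limit=30.0):
    """True if every index is strictly below its limit (assumed thresholds)."""
    return (idx.false_positive_pct < fp_limit
            and idx.false_negative_pct < fn_limit
            and idx.fixation_loss_pct < fl_limit)


# Hypothetical fields: only the first passes all three criteria.
fields = [
    ReliabilityIndices(5.0, 10.0, 20.0),
    ReliabilityIndices(15.0, 0.0, 0.0),   # false positives at the limit
    ReliabilityIndices(0.0, 0.0, 35.0),   # excessive fixation losses
]
reliable = [f for f in fields if is_reliable(f)]
```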

Acquisition of Data From the Visual Fields
In this analysis, the raw values of the patient's retinal sensitivity at each of the HVF 24-2 predetermined points were extracted from PDF scans of the HVFs using a custom data extraction tool based on the Python 21 package "hvf extraction script." 22 This script used Google's Tesseract optical character recognition 23 to distinguish text inside a digital image and return the relevant text in a usable format. Although the "hvf extraction script" was not originally intended for use on scanned documents, cleaning the images before processing gave values for the majority of the visual field locations. A manual validation of the total cohort point retinal sensitivity eliminated missing data and discrepancies between the original values and the data extraction tool.
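The manual validation step can be thought of as a comparison between the values returned by optical character recognition and values read from the original printout. The sketch below is illustrative only and does not use the extraction package itself; the location keys and values are hypothetical.

```python
def find_discrepancies(extracted, manual):
    """Return {location: (extracted_value, manual_value)} for every location
    that is missing from the extraction or differs from the manual reading."""
    issues = {}
    for location, truth in manual.items():
        got = extracted.get(location)  # None if OCR returned nothing
        if got is None or got != truth:
            issues[location] = (got, truth)
    return issues


# Hypothetical example: one misread value and one missing value.
manual_values = {"loc_01": 28, "loc_02": 30, "loc_03": 25}
ocr_values = {"loc_01": 28, "loc_02": 3}

to_review = find_discrepancies(ocr_values, manual_values)
```

Each flagged location would then be corrected from the original printout, so that no missing or discrepant values remain in the analysis dataset.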

HVF Analysis
To detect pointwise change over the course of the study, the points were categorized by individual pointwise retinal sensitivity at baseline. The mean change in sensitivity was plotted at each point from baseline to 12 and 24 months. Subsequently, the cohort was restricted to a population defined by a baseline MD between −2 dB and −7 dB to simulate a medically managed population. Finally, the number of points in the visual field in the whole cohort and in the restricted simulated medically treated cohort that would be expected to change per sensitivity category was calculated.
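The categorization and summary steps above can be sketched as follows. The band boundaries (how values exactly at −10 dB or −25 dB, or better than 0 dB, are assigned) are assumptions made for illustration, as are the function names and the example data.

```python
from statistics import mean, stdev


def band_for(baseline_db):
    """Assign a baseline point to a sensitivity band (assumed boundaries)."""
    if baseline_db >= -10.0:
        return "0 to -10 dB"        # assumption: also holds points better than 0 dB
    if baseline_db >= -25.0:
        return "-10 to -25 dB"
    return "worse than -25 dB"


def summarize_change(points):
    """points: iterable of (baseline_db, follow_up_db) pairs.
    Returns {band: (n_points, mean_change, sd_change)}."""
    changes = {}
    for baseline, follow_up in points:
        changes.setdefault(band_for(baseline), []).append(follow_up - baseline)
    return {band: (len(v), mean(v), stdev(v) if len(v) > 1 else 0.0)
            for band, v in changes.items()}


def in_medical_range(md_db, lower=-7.0, upper=-2.0):
    """True if a baseline MD falls in the simulated medically managed range."""
    return lower <= md_db <= upper


# Hypothetical baseline and 12-month values for five points.
example_points = [(-2.0, -1.0), (-4.0, -5.0), (-12.0, -4.0),
                  (-20.0, -10.0), (-30.0, -28.0)]
summary = summarize_change(example_points)
```

Running the same summary on the points from participants for whom `in_medical_range` is true gives the counts per band for the simulated medically managed cohort.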

Statistical Analysis
Analysis of clinical data was based on the full dataset according to the statistical analysis plan. 7 In this evaluation, analyses were based on a per-protocol analysis. Statistical analysis was performed using R 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria). Data were reported with mean and SD for normally distributed variables and median and range for data that were not normally distributed. Missing clinical data, due to any absence or choice, were excluded from the analysis.

Results
Characteristics of the study population are summarized in Table 1. Retinal sensitivity values for the whole cohort at baseline showed that the central points were less affected than the peripheral points (Fig. 1A). The whole cohort was then categorized according to the extent of their reduced visual function at baseline as per MD category (Figs. 1B-1D). As the visual function declined, the distribution of the average deviation points became increasingly prominent in the periphery and around the blind spot.

Analysis of Pointwise Sensitivities in the Simulated Medically Managed Cohort
Those with an MD between −2 dB and −7 dB at baseline had a similar distribution of changes in the point-sensitive deviation at baseline (Fig. 2B). Overall, the vast majority of data points that were included were in the 0 to −10 dB category (n = 4123), compared with points between −10 dB and −25 dB (n = 346) and those between −25 dB and −5 dB (n = 487) (Table 4).

Analysis of Pointwise Sensitivities in the Whole Cohort
The utility of baseline points between 0 dB and −10 dB was examined to establish how point-sensitivity analysis performed in IIH:WT. As expected, these demonstrated very little change at 12 and 24 months (at 12 months, the mean change was 0.4 ± 3.5 dB; at 24 months, the mean change was 0.48 ± 4.11 dB). Baseline sensitivities between −10 dB and −25 dB have the ability to change over time (Fig. 2). It was only when using the whole cohort that the largest mean changes were found: 8.53 ± 6.75 dB at 12 months and 9.60 ± 6.99 dB at 24 months (Table 3). However, there were fewer points available for analysis (n = 346 at 12 months and n = 329 at 24 months) compared with cases where the baseline pointwise sensitivity ranged from 0 to −10 dB (n = 4123 at 12 months and n = 1844 at 24 months) (Table 4). Furthermore, there was little difference observed between trial arms when analyzing the points that were between −10 dB and −25 dB at baseline: the mean change at 12 months was 8.75 ± 6.51 dB in the bariatric surgery arm and 8.16 ± 7.15 dB in the CWI arm. This was despite a significant difference between baseline and 12 months in the ICP of −6.0 cm cerebrospinal fluid (CSF) between the trial arms. Among those in the CWI arm who were not on acetazolamide, the point sensitivity mean change between baseline and 12 months was 8.03 ± 7.26 dB. Despite the significant reduction in ICP found between the bariatric surgery group and the CWI group, there was little discrimination analyzing point sensitivities among bariatric surgery, CWI, CWI with no acetazolamide treatment, and CWI with concurrent use of acetazolamide (Table 5).

Categorizing the Population by Baseline MD
To understand how representative a baseline point sensitivity beyond −10 dB in one or more points was in an active IIH population, we calculated the number of points worse than −10 dB in each individual (Table 6). In the whole cohort, the median number of points on the baseline visual field worse than −10 dB was 5 (interquartile range, 2-13), with 57% of the cohort having at least two points worse than −10 dB at baseline (Table 6). In the subgroup with an MD between −2 dB and −7 dB at baseline, 42% had more than five points that were worse than −10 dB. As the number of points required for analysis decreased, more participants were available for inclusion; for example, 73% had two or more points, and 85% had one point worse than −10 dB (Table 6). Also, 31% of patients at 12 months and 38% at 24 months would have achieved five or more points that improved by 7 dB (Table 7).
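The two counts reported above (the number of points worse than −10 dB per participant, and the proportion of participants with five or more points improving by 7 dB) can be sketched as follows. The function names and example fields are hypothetical, and the sketch does not check that improving locations are predefined or reproducible.

```python
def count_points_worse_than(field_db, threshold_db=-10.0):
    """Number of locations in one field strictly worse than the threshold."""
    return sum(1 for value in field_db if value < threshold_db)


def share_with_at_least(fields, k, threshold_db=-10.0):
    """Proportion of participants with at least k points worse than the threshold."""
    eligible = sum(1 for f in fields
                   if count_points_worse_than(f, threshold_db) >= k)
    return eligible / len(fields)


def is_responder(baseline_db, follow_up_db, min_gain_db=7.0, min_points=5):
    """True if at least min_points locations improved by min_gain_db or more."""
    improved = sum(1 for b, f in zip(baseline_db, follow_up_db)
                   if f - b >= min_gain_db)
    return improved >= min_points


# Hypothetical baseline fields for four participants (three locations each).
baseline_fields = [
    [-1.0, -12.0, -15.0],
    [-2.0, -3.0, -4.0],
    [-11.0, -2.0, -1.0],
    [-20.0, -22.0, -30.0],
]
share_two_or_more = share_with_at_least(baseline_fields, k=2)
share_one_or_more = share_with_at_least(baseline_fields, k=1)

# Hypothetical participant with five points improving by 8 dB.
responder = is_responder([-15.0] * 5 + [-2.0], [-7.0] * 5 + [-2.0])
```

Lowering `k` admits more participants, which mirrors the pattern in Table 6: the fewer points required, the larger the share of the cohort that qualifies.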

Discussion
In this study, we characterized the pointwise pattern of visual field change in a cohort of people with active IIH recruited to the IIH:WT. Points with baseline sensitivities between 0 dB and −10 dB showed small changes over time and, as expected, were unlikely to demonstrate clinically meaningful change at either 12 or 24 months. Points in the −10 to −25 dB category demonstrated change that could be considered clinically meaningful (mean of 8.5 dB in at least one point in the whole visual field); however, using data between −10 dB and −25 dB resulted in fewer data points and larger SDs for analysis. Although the median number of points worse than −10 dB was five, 43% of all of the IIH:WT participants had fewer than two points worse than −10 dB at baseline, emphasizing that data points worse than −10 dB were not representative of the majority of IIH patients.
It should be emphasized that eligibility for the IIH:WT was not determined by MD criteria. Therefore, to simulate a typical medically managed cohort, we restricted the analysis to baseline HVFs in which the MD was between −2 dB and −7 dB (the criterion range used in the IIHTT 9). Among this group, 42% had five or more points worse than −10 dB at baseline (Table 7). If only two points were required for analysis, 73% had two or more points worse than −10 dB in either eye at baseline (Table 6). Thus, we found that it would be challenging to use point analysis as an outcome for an interventional medical trial in IIH, as the pool of point-sensitivity data available for meaningful analysis would be extremely small. Additionally, the participants overall would be less representative of the whole disease spectrum, which could limit how directly the results translate to clinical practice. Finally, test locations with 8 to 18 dB of loss at baseline had a 95% prediction interval that nearly covered the full measurement range of the instrument (0-40 dB) 24; thus, the test-retest variability of these locations was so poor that there was little signal above the variability-related noise. 25
There is no universally adopted minimal clinically important change in HVF measures in IIH as there is in glaucoma. 17,26,27 In glaucoma, visual field progression equal to or faster than −0.5 dB per year for at least five abnormal test locations at baseline has been found to be clinically significant, 28 as have changes from baseline beyond the 5% probability levels for the Glaucoma Change Probability analysis in five or more reproducible visual field locations. 29 Although pointwise analysis in patients with IIH has revealed changes around the blind spot and in the nasal area, likely reflecting the reduction in optic nerve head swelling as the papilledema resolves, 10 visual fields with global diffuse damage, such as occur in patients with IIH, tend to be more variable than fields with focal damage such as in glaucoma. 30 The fundamental differences between these diseases confound the applicability of glaucoma outcome measures to IIH trials. IIH is a rare condition compared with glaucoma, which immediately affects the trial design and recruitment potential, particularly as other tools that assess visual function, such as visual acuity (Snellen or logMAR), color vision, and contrast sensitivity, have not been found to be discriminatory in medically managed IIH. 9,30
A limitation of this study is that it included only patients with well-established IIH. Thus, the results may not be applicable to patients with recently diagnosed IIH or to severely affected patients who may require urgent surgery. 6,31 In addition, because our cohort was small, it was subject to regression to the mean with respect to the mean deviation (Fig. 2). Regression to the mean is a common statistical phenomenon that occurs when repeated measurements are made on the same subject. Subjects would not be expected to have the same measurements at two different times due to measurement error and random fluctuation. Regression to the mean needs to be considered to distinguish a real change from the expected change due to the natural variation in test readings. To minimize regression to the mean, participants should be randomized to study arms, with a control arm being fundamental to the design of the trial. Variability can be further reduced by selecting participants using two or more baseline measurements, resulting in better estimates of the mean and the within-subject variation.
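The size of the regression-to-the-mean effect can be illustrated with a minimal sketch. It assumes a classical measurement-error model in which each reading is a stable true value plus independent, normally distributed noise; the function name and all numbers are hypothetical and are not estimates from the trial data.

```python
def expected_follow_up(baseline_db, population_mean_db,
                       between_var, within_var, n_baseline=1):
    """Expected follow-up reading when there is no true change.

    Assumed model: reading = true value + noise. Averaging n_baseline
    readings divides the noise variance by n_baseline.
    """
    reliability = between_var / (between_var + within_var / n_baseline)
    return population_mean_db + reliability * (baseline_db - population_mean_db)


# Hypothetical point selected because it read -20 dB at baseline,
# in a population whose mean at that location is -5 dB.
one_baseline = expected_follow_up(-20.0, -5.0, between_var=9.0, within_var=9.0)
two_baselines = expected_follow_up(-20.0, -5.0, between_var=9.0, within_var=9.0,
                                   n_baseline=2)

apparent_gain_one = one_baseline - (-20.0)   # spurious "improvement"
apparent_gain_two = two_baselines - (-20.0)
```

Under these assumptions the point appears to improve by 7.5 dB with a single baseline measurement and by 5 dB when two baseline measurements are averaged, even though nothing has changed. This is why a randomized control arm and repeated baseline testing are needed to separate a real change from the expected drift.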
In this study and in studies reported by others, 10 the visual field deficit in IIH typically occurs across the full visual field and increases with eccentricity. 13 Unfortunately, these are the very points that show the largest variability in visual field testing. 32,33 Visual field tests also have been found to be unreliable when visual field locations have sensitivity below 15 to 19 dB because of a reduction in the asymptotic maximum response probability. 34 In addition to test limitations, there are demonstrable changes in cognition in the domains of attention and executive function that have been found in patients with IIH and that directly affect the performance indicators in HVF testing. 35
Our data indicate that point analysis of the HVF has no advantage over global MD analysis, at least in the population we studied from the IIH:WT. As expected, baseline points that were better than −10 dB had little room to improve over time and, thus, offered little utility for analysis. If the generic threshold for a clinically meaningful change of 7 dB is recommended for IIH treatment trials, baseline points in the range of −10 to −25 dB would be needed for analysis. The US Food and Drug Administration has stated that visual field loss has likely occurred if ≥5 visual field locations have significant change beyond the 5% probability level or if there is at least a 7-dB between-group mean difference for the entire field. 17 A pointwise approach for people with IIH is not feasible as demonstrated here because there are too few data points available for analysis. Additionally, the points that could be used are known to be more variable. In our study, even when a limited threshold was set to determine a clinically meaningful change, point analysis did not offer an advantage over global MD. Consequently, point sensitivity analysis in medically treated IIH is likely to be prohibitive in clinical trials and not representative of the IIH disease spectrum.
In addition, if the requirement of regulators for a meaningful change in MD is a 7-dB difference between trial arms, as recommended for glaucoma treatment trials, 17 then using MD as a primary outcome would not be achievable in medical IIH trials, as these typically recruit participants with MDs between −2 dB and −7 dB. Future studies may consider investigating the use of a larger stimulus size that has been demonstrated to retain the ability to detect defects, lower retest variability, and improve the useful dynamic range of the instrument. 36,37